(19) Europäisches Patentamt
     European Patent Office
     Office européen des brevets

(11) EP 1 058 446 A2

(12) EUROPEAN PATENT APPLICATION

(43) Date of publication: 06.12.2000 Bulletin 2000/49

(51) Int Cl.7: H04M 3/533

(21) Application number: 00304356.9

(22) Date of filing: 23.05.2000

(84) Designated Contracting States:
     AT BE CH CY DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE
     Designated Extension States:
     AL LT LV MK RO SI

(30) Priority: 03.06.1999 US 325143

(71) Applicant: LUCENT TECHNOLOGIES INC.
     Murray Hill, New Jersey 07974-0636 (US)

(72) Inventors:
     • Lee, Chin-Hui
       Basking Ridge, N.J. 09720 (US)
     • Ramesh, Padma
       New Providence, N.J. 07974 (US)

(74) Representative: Buckley, Christopher Simon Thirsk et al
     Lucent Technologies (UK) Ltd,
     5 Mornington Road
     Woodford Green, Essex IG8 0TU (GB)

(54) Key segment spotting in voice messages 

(57) A method and system of identifying and spotting segments
containing key information in voice messages. The method can be
used to spot a key segment such as a name segment in a voice
message by detecting and verifying the presence of a phrase such
as "My name is ..." or "This is ...". Once the key segment of
interest has been spotted, the method provides the user with only
the pertinent information (e.g., the name of the caller), which is
contained in the key segment. This allows a user retrieving a
message to hear just a desired section or sections of a message
without listening to the rest of the message.






Printed by Jouve. 75001 PARIS (FR) 






Description 
Related Applications

[0001] The present application is related to U.S. Patent
Application Serial No. (Attorney Docket No. Lee 23-2), entitled
VOICE MESSAGE FILTERING FOR CLASSIFICATION OF VOICE MESSAGE
ACCORDING TO CALLER, filed on even date herewith and incorporated
herein by reference in its entirety.

Field Of The Invention

[0002] The present invention relates to voice messag- 
ing systems and methods, in particular, key segment 
spotting in voice messages. 

Background Information 

[0003] In voice messaging (or "voice-mail") systems, 
a user is often forced to listen to multiple, often lengthy 
messages to obtain certain items of essential informa- 
tion such as the names of the callers who have left the 
messages and the callers' return telephone numbers. 
This can be a tedious and time-consuming process. Furthermore,
the manual process of transcribing the essential information is
susceptible to errors.

Summary Of The Invention 

[0004] The present invention is directed to a method and system
of identifying and spotting segments containing key information in
voice messages. For example, the method of the present invention
can be used to spot a name segment in a voice message by detecting
and verifying the presence of a segment such as "My name is ..."
or "This is ...". The method can also be used to spot a phone
number segment by detecting and verifying the presence of a
segment such as "My number is ..." or "Call me back at ..." or by
spotting the numerical part of the message such as "[my number is]
3-6-4-7-5-8-9". Once the key segment of interest has been spotted,
the method or system of the present invention can provide the user
with only the pertinent information (e.g., the name of the caller)
contained in the key segment. The method of the present invention
can spot the key segments and can then retrieve only the desired
segments. This allows a user retrieving a message to hear just a
desired section or sections of a message without having to listen
to the rest of the message.
[0005] The method of the present invention is advantageously
useful in sorting through a large number of voice mail messages.
The method speeds up the process of searching for particular
messages, messages from particular callers, or for certain
segments within messages.



Brief Description Of The Drawing

[0006] FIG. 1 illustrates a key segment registration procedure
in accordance with the present invention.
[0007] FIG. 2 illustrates the handling of voice messages in
accordance with the present invention.
[0008] FIG. 3 illustrates the retrieval of key segments and
messages with key segments, in accordance with the present
invention.


Detailed Description 

[0009] In an exemplary embodiment of a method in accordance with
the present invention, key segment spotting is achieved by first
having a user register the key segments he would like to spot in
the messages. This procedure is illustrated in FIG. 1. As shown,
the registration of key segments can be done by text input (e.g.,
if a keyboard is available, the user can type in the key segment
to be registered) or by voice input (e.g., the user speaks the key
segment to be registered).
[0010] Also, the user may register a key segment by using part
of an actual voice message. As shown in FIG. 1, a user, while
playing back a stored voice message, can mark at 13 a key segment
within the message by pressing, for example, the "B" key to mark
the beginning of the key segment and the "E" key to mark the end
of the key segment. By pressing a further key sequence, e.g., **S,
the user can indicate that the marked segment, delimited with the
B and E key presses, is to be registered. This feature is useful,
for example, for saving the names of message senders as spoken by
the senders in order to spot them later.
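
By way of illustration only, the marking procedure of paragraph [0010] might be sketched as follows. The names MarkedSegment and mark_segment, the use of playback positions in seconds, and the list of (position, key) events are assumptions made for this sketch; the application does not specify how key presses are represented.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class MarkedSegment:
    start_s: float  # playback position when "B" was pressed
    end_s: float    # playback position when "E" was pressed


def mark_segment(key_events: List[Tuple[float, str]]) -> Optional[MarkedSegment]:
    """Return the segment delimited by B and E once **S confirms it.

    key_events is a hypothetical list of (playback position in seconds, key)
    pairs in the order the keys were pressed during playback.
    """
    start: Optional[float] = None
    end: Optional[float] = None
    pressed = ""
    for position, key in key_events:
        pressed += key
        if key == "B":
            # A new "B" restarts the marking.
            start, end = position, None
        elif key == "E" and start is not None and position > start:
            end = position
        elif pressed.endswith("**S") and start is not None and end is not None:
            # Confirmation sequence received: the marked span is to be registered.
            return MarkedSegment(start, end)
    return None
```

In this sketch a span is returned only after both marks and the confirming sequence have been seen, so an unfinished marking leaves nothing to register.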

[0011] Commonly occurring key segments such as name segments,
phone number segments and date segments may be provided without
registration as predefined segments. As discussed below, such
predefined segments can be retrieved by pressing predefined key
sequences.

[0012] As shown in FIG. 1, the user can input a key segment to
be registered either as text, speech, pronunciation or by marking
a segment within a message. Text can be entered, for example, with
an alphanumeric key pad (not shown), keyboard or any other such
text-entry device. A speech representation of a key segment can be
entered, for example, via the audio path of a telephone (such as
the user might use to dial into the system of the present
invention). The pronunciation can be specified using any set of
symbols, such as the IPA symbol set. The symbols can be entered,
for example, as text.

[0013] If entered as text, the text of the key segment is
processed at 11 through a text-to-speech front end to obtain the
pronunciation of the key segment. For example, if the user enters
the word "four", the text-to-speech front end would generate the
IPA symbol sequence f-ow-r to represent the pronunciation. If the
user speaks the key segment or marks the key segment in a message,
the key segment is processed at 12 to generate its pronunciation
using speech recognition.
[0014] An identifier of the key segment (e.g., a segment name)
and the corresponding characteristics (e.g., the pronunciation) of
the key segment are stored at 15 in a storage device or memory.
The text-to-speech and speech recognition functions can be
implemented in conventional ways using known methods and systems.
For example, the speech recognition function 12 can be implemented
in accordance with the methods and systems described in U.S.
Patents Nos. 4,713,777, 4,718,088, 5,509,104, 5,579,436, and/or
5,649,057. The text-to-speech function can be implemented as
described in "Multilingual Text-to-Speech Synthesis: The Bell Labs
Approach," by R. W. Sproat, Kluwer Academic Publishers, 1998.
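
As a minimal sketch of the registration of FIG. 1, the following stores an identifier and a pronunciation for each key segment. The toy lexicon is merely a stand-in for a text-to-speech front end, and the names TOY_LEXICON, text_to_pronunciation and KeySegmentRegistry are hypothetical; only the f-ow-r example is taken from the description.

```python
from typing import Dict, List

# Toy lexicon standing in for a real text-to-speech front end (assumption).
TOY_LEXICON: Dict[str, List[str]] = {
    "four": ["f", "ow", "r"],
    "my": ["m", "ay"],
    "name": ["n", "ey", "m"],
    "is": ["ih", "z"],
}


def text_to_pronunciation(text: str) -> List[str]:
    """Look up each word in the toy lexicon and concatenate the sound units."""
    units: List[str] = []
    for word in text.lower().split():
        if word not in TOY_LEXICON:
            raise KeyError(f"no pronunciation known for {word!r}")
        units.extend(TOY_LEXICON[word])
    return units


class KeySegmentRegistry:
    """Holds an identifier and a pronunciation per key segment (store 15)."""

    def __init__(self) -> None:
        self._segments: Dict[str, List[str]] = {}

    def register_text(self, name: str, text: str) -> None:
        # Text input path (11): derive the pronunciation from the text.
        self._segments[name] = text_to_pronunciation(text)

    def register_pronunciation(self, name: str, units: List[str]) -> None:
        # Pronunciation entered directly, or produced by a recognizer (12).
        self._segments[name] = list(units)

    def is_registered(self, name: str) -> bool:
        return name in self._segments

    def pronunciation(self, name: str) -> List[str]:
        return self._segments[name]
```

Keeping the pronunciation rather than the raw text or audio means that all three input paths end in the same stored form, which is what the detection stage consumes.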

[0015] As voice messages are received, the messages are
processed as illustrated in FIG. 2 in order to search for
registered and/or predefined key segments. Using the key segment
characteristics stored at 15 and speaker-independent models (for
the sound units of the pronunciation), key segment detection is
performed at 21 to spot one or more registered or predefined key
segments in a voice message. The key segment detection at 21 can
be implemented in a known way using conventional wordspotting or
phrase detection technology, such as described in U.S. Patent No.
5,509,104.
[0016] To enhance the accuracy of the key segment detection,
utterance verification is performed at 23 on the key segments
detected at 21. Utterance verification is used to confirm that the
segments detected at 21 contain the information that is sought.
Utterance verification can be performed as described, for example,
in U.S. Patent No. 5,675,706. The messages are then tagged at 25
with the key segments and the locations of the key segments in the
messages to facilitate their later retrieval. In one exemplary
embodiment, each message is stored with a header containing tag
information. The tag information, for example, may indicate the
locations of key segments detected within the message. The
location of each key segment can be represented, for example, as
an offset in time or address space from the beginning of the
message.
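
The detect, verify and tag sequence of FIG. 2 might be sketched as below. This is a deliberately simplified illustration: a real system would use acoustic wordspotting and utterance verification as in the cited patents, whereas here the message is assumed to be already decoded into (sound unit, start time, confidence) triples, detection is exact subsequence matching, and verification is a mean-confidence threshold. All names and the threshold value are assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# (sound unit, start time in seconds, confidence in [0, 1]); hypothetical format.
Unit = Tuple[str, float, float]


@dataclass
class TaggedMessage:
    units: List[Unit]
    # Header: key segment name -> start offsets in seconds from message start.
    header: Dict[str, List[float]] = field(default_factory=dict)


def detect(units: List[Unit], pronunciation: List[str]) -> List[int]:
    """Step 21 stand-in: indices where the pronunciation occurs in the stream."""
    n = len(pronunciation)
    if n == 0:
        return []
    symbols = [u[0] for u in units]
    return [i for i in range(len(symbols) - n + 1)
            if symbols[i:i + n] == pronunciation]


def verify(units: List[Unit], start: int, length: int, threshold: float) -> bool:
    """Step 23 stand-in: accept if the mean confidence reaches the threshold."""
    scores = [u[2] for u in units[start:start + length]]
    return sum(scores) / len(scores) >= threshold


def tag_message(message: TaggedMessage, name: str,
                pronunciation: List[str], threshold: float = 0.5) -> None:
    """Step 25: record the time offset of each verified detection in the header."""
    for start in detect(message.units, pronunciation):
        if verify(message.units, start, len(pronunciation), threshold):
            message.header.setdefault(name, []).append(message.units[start][1])
```

Storing offsets in a per-message header, as the description suggests, lets later retrieval jump to a segment without repeating detection.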

[0017] Messages in which no registered or predefined key
segments are detected can be stored in a conventional manner
without being tagged and can be retrieved in a conventional
manner.

[0018] Once one or more messages have been 
tagged and stored, the messages and/or key segments 
within the messages can be retrieved. An exemplary 
message retrieval procedure in accordance with the 
present invention is illustrated in FIG. 3. 
[0019] The retrieval procedure is initiated when a user enters
an enquiry for a key segment. The enquiry can be entered by a
variety of means, including speech (i.e., speaking the desired key
segment), by typing the name or pronunciation of the key segment,
or by pressing a sequence of one or more buttons on a keypad,
wherein the sequence identifies the desired key segment.

[0020] Upon receiving the user enquiry for a key segment, the
procedure first determines at 31 whether the user has entered the
enquiry by speech, i.e., if the user has spoken the name of the
key segment. If so, operation proceeds to 33 in which speech
recognition is performed on the spoken enquiry to determine the
segment name spoken.

[0021] Operation then proceeds to 35 in which it is determined
if the specified key segment is one that has been predefined or
already registered. If the key segment to which the user's enquiry
pertains is registered or is one of the predefined segments,
operation proceeds to 37 in which a search for the specified key
segment is performed in the tagged messages. At 39, the specified
key segment is retrieved from those messages in which it was
found. If the enquired-about key segment is found in multiple
messages, each occurrence of the key segment is retrieved.

[0022] To access predefined key segments, the user may press
predefined key sequences on the user's telephone dial pad, such as
**T for the telephone number segment, **N for the name segment,
**D for the date segment, and so on. Furthermore, telephone number
detection with **T can include number verification. A number
retrieved from a segment of a message can optionally be dialed by
pressing a predefined key sequence (e.g., **C).
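
The key sequences named in the description can be pictured as a small lookup table. The sketch below is illustrative only: the sequences **T, **N, **D, **C and **S come from the text, while the function name interpret_keys and the returned action labels are assumptions.

```python
from typing import Dict, Optional

# Sequences that select a predefined key segment (from paragraph [0022]).
PREDEFINED_SEQUENCES: Dict[str, str] = {
    "**T": "telephone number segment",
    "**N": "name segment",
    "**D": "date segment",
}

# Sequences that act on a marked or retrieved segment; labels are illustrative.
COMMAND_SEQUENCES: Dict[str, str] = {
    "**C": "dial retrieved number",
    "**S": "save segment",
}


def interpret_keys(keys: str) -> Optional[str]:
    """Map a dial-pad sequence to an action label, or None if unrecognized."""
    if keys in PREDEFINED_SEQUENCES:
        return "retrieve " + PREDEFINED_SEQUENCES[keys]
    return COMMAND_SEQUENCES.get(keys)
```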

[0023] If it is determined at 35 that the key segment to which
the user's enquiry pertains is a new key segment (i.e., it is not
a predefined or registered segment), the characteristics (e.g.,
pronunciation) of the key segment are first obtained at 36 with
the procedure of FIG. 1. The stored messages are then tagged at
38, as per the message handler procedure of FIG. 2, to indicate
where, if at all, the newly specified key segment is found in the
stored messages. Once the messages have been tagged with respect
to the new key segment, the key segment is retrieved at 39, as
described above.
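
The decision flow of FIG. 3 (steps 35 to 39) might be outlined as follows. This is a sketch under stated assumptions: stored messages are modelled only by their tag headers, and the register and retag callables are hypothetical placeholders for the procedures of FIG. 1 and FIG. 2 respectively.

```python
from typing import Callable, Dict, List, Set, Tuple

# A stored message is modelled by its header: segment name -> offsets (seconds).
Header = Dict[str, List[float]]


def retrieve(
    name: str,
    known_segments: Set[str],
    stored: Dict[str, Header],
    register: Callable[[str], None],
    retag: Callable[[str, Dict[str, Header]], None],
) -> List[Tuple[str, float]]:
    """Return (message id, offset) for every tagged occurrence of the segment."""
    if name not in known_segments:       # step 35: new key segment
        register(name)                   # step 36: procedure of FIG. 1
        known_segments.add(name)
        retag(name, stored)              # step 38: procedure of FIG. 2
    hits: List[Tuple[str, float]] = []
    for message_id, header in stored.items():    # step 37: search tagged messages
        for offset in header.get(name, []):
            hits.append((message_id, offset))    # step 39: each occurrence
    return hits
```

For a segment that is already registered, the function only reads the headers; the comparatively costly retagging of stored messages occurs once, the first time a new segment is requested.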

[0024] When a key segment is retrieved at 39 from a message, the
user can opt to save the retrieved key segment for future use as a
key segment by pressing a predefined sequence of keys (e.g., **S).
Furthermore, if a name segment is retrieved, it can be used to
identify the caller and hence can be used for message filtering
and classification of messages according to the caller. This
enables the system of the present invention to save, for example,
the message sender's name in the sender's own voice for later use
in identifying, tagging and retrieving the sender's messages.

[0025] The present invention uses speech recognition,
wordspotting, key-word detection and utterance verification
technologies for spotting key segments in messages. It can also
use speech coding technology for key segment spotting in coded
voice mail messages.
[0026] The present invention can be implemented as part of a
voice messaging system, such as the AUDIX system, available from
Lucent Technologies, Inc. The present invention can be implemented
on a general purpose computer with software or with special
purpose hardware.



Claims 

1. A method of spotting a key segment in a voice message
comprising the steps of:

identifying a key segment;

receiving a voice message;

detecting the key segment in the voice message;

tagging the voice message so as to indicate the location of the
detected key segment within the voice message;

receiving an enquiry for the key segment; and

retrieving the key segment from the voice message.

2. The method of claim 1, wherein the step of identifying a key
segment includes registering the key segment by storing an
identification and a characteristic of the key segment.

3. The method of claim 1, wherein the step of identifying a key
segment includes predefining the key segment.

4. The method of claim 1, wherein the enquiry for the key segment
includes speech.

5. The method of claim 2, wherein the characteristic of the key
segment includes a pronunciation of the key segment.

6. A method of spotting a key segment in a voice message
comprising the steps of:

receiving a voice message;

receiving an enquiry for a key segment;

detecting the key segment in the voice message;

tagging the voice message with the location of the detected key
segment; and

retrieving the key segment from the voice message.

7. The method of claim 6, comprising the step of registering the
key segment by storing an identification and a characteristic of
the key segment.

8. The method of claim 7, wherein the characteristic of the key
segment includes a pronunciation of the key segment.